ABSTRACT


In this paper we consider the problem of robust estimation of the scale of the
location residuals when the underlying distribution of the data belongs to a
contamination neighborhood of a parametric location-scale family. We define the
class of M-estimates of scale with general location, and show that under certain
regularity assumptions, these scale estimates converge to their asymptotic
functionals uniformly with respect to the underlying distribution, and with respect
to the M-estimate defining score function $\chi$. We establish expressions for the
maximum asymptotic bias of M-estimates of scale over the contamination neighborhood
as a function of the fraction of contamination. Using these expressions we construct
asymptotically min-max bias robust estimates of scale. In particular, we show that a
scaled version of the Madm (median of absolute residuals about the median) is
approximately min-max bias-robust within the class of Huber's proposal 2 joint
estimates of location and scale. We also consider the larger class of M-estimates of
scale with general location, and show that a scaled version of the Shorth (the
shortest half of the data) is approximately min-max bias robust in this class.
Finally, we present the results of a Monte Carlo study showing that the Shorth has
attractive finite sample size mean squared error properties for contaminated
Gaussian data.

The August 1991 Technical Report No. 214 form of this paper is a considerably
revised and extended version of the October 1989 Technical Report No. 184.
1.  INTRODUCTION


A main theoretical approach to robustness has consisted of studying the asymptotic
behavior of an estimate when the underlying distribution of the data belongs to
some neighborhood (e.g. $\varepsilon$-contamination or Levy neighborhood) of a
parametric model. In this context one tries to obtain estimates which optimize some
appealing criterion, e.g., minimize the maximum asymptotic variance over a given
neighborhood. Huber (1964) is the earliest example of this approach, with focus on
M-estimates of location.

The best known part of Huber (1964) is the result that a particular M-estimate
of location, namely the one with psi-function $\psi(x) = \min\{c, \max(x, -c)\}$,
minimizes the maximum asymptotic variance over symmetric $\varepsilon$-contamination
neighborhoods of a Gaussian model. A considerably less well known part of Huber
(1964) is that concerned with asymptotic bias of location estimates for unrestricted
asymmetric $\varepsilon$-contamination neighborhoods of a nominal Gaussian model:
among all translation equivariant estimates, the median minimizes the maximum
asymptotic bias over such neighborhoods. The relevance of this result seems
considerable in view of the needed realism of allowing asymmetric contamination.

Recently there has been a renewed interest in bias-robustness. In particular,
Donoho and Liu (1988) have shown that minimum distance estimates have desirable
bias robustness properties. Martin, Yohai and Zamar (1989) have obtained
asymptotically minimax bias regression estimates, and Martin and Zamar (1989)
have obtained minimax bias estimates of scale for positive random variables.

In  this  paper  we  obtain  minimax  bias  robust  estimates  of  scale  for  contamination 
models  with  a  nominal  distribution  which  is  symmetric  about  an  unknown  location 
parameter.  More  precisely,  we  assume  that 

(A0) $F_0$ is a specified distribution function with an even and unimodal density $f_0$.

The distribution $F$ for independent and identically distributed observations
$X_1, \ldots, X_n$ belongs to the $\varepsilon$-contaminated family

$$\mathcal{F}_\varepsilon = \left\{ F : F(x) = (1-\varepsilon)\,F_0\!\left(\frac{x-\mu_0}{s_0}\right) + \varepsilon H(x),\ x \in R,\ \varepsilon \text{ fixed in } (0, .5) \right\}, \tag{1}$$

where $\mu_0$ is the unknown location parameter, $s_0$ is the unknown scale parameter
and $H$ is an arbitrary (and unspecified) distribution.

The first step in obtaining a minimax estimate is to derive the maximal asymptotic
bias $B_T(\varepsilon)$ of an estimate $T$ over the family $\mathcal{F}_\varepsilon$.
From $B_T(\varepsilon)$ one may construct a maximum bias curve, namely a plot of
$B_T(\varepsilon)$ versus $\varepsilon$. The maximum bias curve includes the gross
error sensitivity $GES_T$, namely the slope of $B_T(\varepsilon)$ at $\varepsilon = 0$,
and also the breakdown-point $\varepsilon^*_T$, which is the location of the
singularity where $B_T(\varepsilon)$ goes to infinity. While the two-number summary
consisting of $GES_T$ and $\varepsilon^*_T$ provides considerable information, one
naturally would like to have the entire curve $B_T(\varepsilon)$ if possible. Not only
do such curves allow one to check the range of accuracy of $GES_T$ as a linear
approximation, they may also lead to a different preference ordering of competing
estimates than one might make on the basis of $GES_T$ and $\varepsilon^*_T$ (e.g. see
Section 5 and also Martin, Yohai and Zamar, 1989, who find min-max bias robust
regression estimates with $GES_T = \infty$).

Figure 1 displays the maximum bias curves for three proposed robust estimates
of scale: H95, a Huber proposal 2 estimate of scale, adjusted for 95% efficiency
at the Gaussian model (Huber, 1964); the median of absolute deviations about
the median (Madm); and the "shortest half" of the data (Shorth). Observe that
$\varepsilon^*_{Shorth} = \varepsilon^*_{Madm} = .5$, the largest possible value of
$\varepsilon^*$, and $\varepsilon^*_{H95} = .17$. The breakdown point of a classical
Gaussian maximum likelihood estimate is typically zero. The $GES_T$ lines provide
local linear approximations to the maximum bias, which are reasonable for not too
large values of $\varepsilon$ (just how large the reader can judge for himself; see
the rule of thumb in Hampel et al., 1986).

In Section 2 we show that, under certain regularity conditions, the finite sample
value and the asymptotic value of robust M-scale estimates are uniformly close, as
$F$ ranges over the family $\mathcal{F}_\varepsilon$. Moreover, prior results in
Martin and Zamar (1989) indicate that the bias is a significant component of the
mean-squared error for rather small to moderate sample sizes, depending on the
value of $\varepsilon$.

The remainder of the paper is organized as follows. Section 2 introduces the class
of M-estimates of scale with general location. This class includes the well known
Huber (Proposal 2) M-estimates of location and scale, and also the class of scale
estimates called S-estimates, which are associated with so called S-estimates of
regression (Rousseeuw and Yohai, 1984). Section 2 also shows that, under certain
regularity conditions, the finite sample value and the asymptotic value of
M-estimates of scale are uniformly close, as $F$ ranges over the family
$\mathcal{F}_\varepsilon$. Moreover, prior results in Martin and Zamar (1989)
indicate that the bias is a significant component of the mean-squared error for
rather small to moderate sample sizes, depending on the value of $\varepsilon$.
Section 3 gives a class of generalized bias functions to deal with the intrinsic
asymmetry of the bias of scale estimates. Section 4 constructs minimax bias-robust
estimates for the class of Huber (Proposal 2) M-estimates of location and scale,
and shows that the bias robust estimates are well approximated by the Madm.
Section 5 constructs minimax bias-robust S-estimates of scale, which are shown to
be min-max in the larger class of M-estimates of scale with general location
introduced in Section 2. Section 5 also shows that these estimates are reasonably
well approximated by the Shorth. Section 6 briefly discusses the difference between
bias-robust Huber estimates and S-estimates. Sections 7 and 8 give some encouraging
finite sample results. Finally, Section 9 closes with a brief discussion of the GES
linear approximation to the maximum bias curve. Proofs of lemmas and theorems
are given in Section 10.

Our results on the Shorth complement recent results of Rousseeuw and Leroy
(1988), who propose the Shorth as a robust scale estimate. They derive the influence
function, the finite sample breakdown-point, and a correction factor to achieve
approximate finite sample size unbiasedness at the normal distribution. Another
interesting recent work on the Shorth is that of Grubel (1988), who establishes
asymptotic normality.

2.  M-ESTIMATES  OF  SCALE  WITH  GENERAL  LOCATION 

Estimates of scale are conveniently viewed as translation invariant, scale
equivariant functionals $S(F)$ defined over a subset $\mathcal{F}$ of distribution
functions $F$, which is assumed to include all the empirical distribution functions
$F_n$ and the $\varepsilon$-contamination family (1). The scale estimate $s_n$ is
then obtained by evaluating the functional $S(F)$ at $F_n$: $s_n = S(F_n)$.

Suppose that

(A1) $\chi$ is even, nondecreasing on $[0, \infty)$, bounded, with at most a finite
number of jumps, and $\chi(\infty) = 1$.

Let

$$b(\chi) = E_{F_0}\,\chi(X)$$

and for each $t \in R$, let $S(F, t)$ be the M-estimate of scale of $X - t$ defined by

$$S(F, t) = \sup\left\{ s : E_F\,\chi[(X - t)/s] > b(\chi) \right\}. \tag{2}$$

We also assume that

(A2) $\varepsilon < b(\chi) < 1 - \varepsilon$, for a fixed value of $\varepsilon \in (0, 0.5)$.




In view of (2), (A1) and (A2), there is no loss of generality in the assumption that
$\chi(\infty) = 1$.


The definition (2) is needed to ensure uniqueness and to handle possible
discontinuities of $F$ and $\chi$. If $\chi$ (or $F$) is continuous, then $S(F, t)$
satisfies the equation

$$E_F\,\chi[(X - t)/S(F, t)] = b(\chi). \tag{3}$$

Since the location parameter $\mu_0$ in (1) is unknown, it must be estimated along
with $s_0$. Let $T(F)$ be a location and scale equivariant functional, that is,

$$T[F((x - t)/s)] = s\,T[F(x)] + t, \quad \forall\, t \in R,\ \forall\, s > 0.$$

The M-estimate of scale with general location is now defined as

$$S(F) = S[F, T(F)].$$


Some  particular  cases  are: 

Huber Proposal 2. In this case $T(F)$ and $S[F, T(F)]$ simultaneously satisfy (3),
with $t = T(F)$, and

$$E_F\,\psi\{[X - T(F)]/S[F, T(F)]\} = 0, \tag{4}$$

where

(A3) $\psi(x)$ is odd, nondecreasing, bounded, with at most a finite number of jumps.

In particular the Madm is obtained when $\chi$ is the jump function

$$\chi_a(x) = \begin{cases} 0 & \text{if } |x| < a \\ 1 & \text{if } |x| > a, \end{cases} \tag{5}$$

with $a = F_0^{-1}(3/4)$, and $\psi$ is the "sign" function

$$\psi_0(x) = \begin{cases} -1 & \text{if } x < 0 \\ \phantom{-}0 & \text{if } x = 0 \\ \phantom{-}1 & \text{if } x > 0. \end{cases} \tag{6}$$

In this case $T(F) = F^{-1}(1/2)$ is the median of $F$.
S-Estimate of Scale. In this case the location estimate $T(F)$ is a minimizer of
$S(F, t)$, that is,

$$S(F) = \min_t S(F, t). \tag{7}$$

It is not difficult to see that $S(F)$ and $T(F)$ satisfy (3) and (4) with
$\psi(x) = \chi'(x)$. Since $\chi$ is bounded, $\psi(x)$ tends to zero as $x$ tends
to infinity, that is, $\psi(x)$ is redescending. In particular, the Shorth is
obtained when $\chi$ is given by (5). Observe that the Madm and the Shorth both have
the same chi-function (5) but different centering functionals.
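
As a concrete illustration of the two jump-function estimates just described, the following Python sketch computes sample versions of the Madm and the Shorth for a standard normal $F_0$, each scaled by $1/a$ with $a = F_0^{-1}(3/4)$. The function names, and the finite-sample convention that a "half" of the data means $\lfloor n/2 \rfloor + 1$ points, are assumptions made for illustration and are not taken from the paper.

```python
from statistics import NormalDist, median

# Consistency constant a = F0^{-1}(3/4) for a standard normal F0 (about 0.674).
A = NormalDist().inv_cdf(0.75)


def madm_scale(x):
    """Median of absolute residuals about the median, scaled by 1/a."""
    m = median(x)
    return median([abs(xi - m) for xi in x]) / A


def shorth_scale(x):
    """Half the length of the shortest interval covering h points, scaled by 1/a.

    h = n // 2 + 1 is an assumed finite-sample convention for "half" the data.
    """
    xs = sorted(x)
    n = len(xs)
    h = n // 2 + 1
    shortest = min(xs[i + h - 1] - xs[i] for i in range(n - h + 1))
    return 0.5 * shortest / A


sample = [1.0, 2.0, 3.0, 4.0, 100.0]
print(madm_scale(sample), shorth_scale(sample))
```

For this small sample both estimates equal $1/a$, roughly 1.483: the gross outlier at 100 affects neither the median absolute residual (which is 1) nor the shortest half (which has length 2).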

The following lemma shows that under mild assumptions the breakdown point of
the functional $S(F, t)$ is larger than $\varepsilon$. The proof is straightforward
and therefore omitted.

LEMMA 1. Let $K > 0$ be given and suppose that (A0), (A1) and (A2) hold. Then,
there exist $0 < s_1 < s_2 < \infty$ such that $s_1 < S(F, t) < s_2$ for all
$|t| \le K$ and $F \in \mathcal{F}_\varepsilon$.

Theorem 1 below shows that, under some regularity conditions which include the
continuity of $\chi$, $S(F_n) \to S(F)$ a.s. $[F]$ as $n \to \infty$, uniformly over
$\chi \in C$, where $C$ is a certain class of $\chi$-functions. Unfortunately, the
case of $\chi$-functions of the jump type given by (5) is not covered by Theorem 1.
However, Theorem 6 and the Monte Carlo results presented in Sections 7 and 8 support
the finite sample relevance of the asymptotic minimax-bias theory for this important
special case.

The following definitions are needed for stating Theorem 1.

$$g_\chi(s, t) = E_{F_0}\,\chi[(X - t)/s], \qquad h_\chi(s, t) = (\partial/\partial s)\, g_\chi(s, t). \tag{8}$$

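THEOREM 1. Suppose that (A0)-(A2) hold. Assume also that $\chi$ and $h_\chi(s, t)$
are continuous and that $h_\chi(s, t) < 0$ for all $s > 0$, $t \in R$. Let $K > 0$ be
fixed. Then, for all $\delta > 0$,

(a) $\lim_{m \to \infty} \sup_{F \in \mathcal{F}_\varepsilon} P_F \left\{ \sup_{n \ge m} \sup_{|t| \le K} |S(F_n, t) - S(F, t)| > \delta \right\} = 0.$

(b) If $S(F)$ is given by (7) then
$\lim_{m \to \infty} \sup_{F \in \mathcal{F}_\varepsilon} P_F \left\{ \sup_{n \ge m} |S(F_n) - S(F)| > \delta \right\} = 0.$

(c) If $\sup_{F \in \mathcal{F}_\varepsilon} |T(F)| < \infty$ and [...] then

$$\lim_{m \to \infty} \sup_{F \in \mathcal{F}_\varepsilon} P_F \left\{ \sup_{n \ge m} \left| S[F_n, T(F_n)] - S[F, T(F)] \right| > \delta \right\} = 0.$$

(d) Let $x_0 > 0$ be fixed. The class $C$ is defined as the set of $\chi$-functions
satisfying all the previous assumptions and (i) $\chi(x) = 1$, $\forall\, |x| \ge x_0$
and (ii) there exists $h_0(s, t)$ such that $h_\chi(s, t) \le h_0(s, t) < 0$,
$\forall\, s > 0$, $\forall\, t \in R$.

Then (a), (b) and (c) hold uniformly on $C$.

Remark. Suppose that a certain function $\chi_0$ satisfies (A1) and (A2) and is such
that

$$h_{\chi_0}(s, t) = -\frac{1}{s}\, E_{F_0}\!\left[ \chi_0'\!\left(\frac{X - t}{s}\right) \frac{X - t}{s} \right] < 0, \quad \forall\, s > 0,\ t \in R, \quad \text{and} \quad \chi_0(x) = 1,\ \forall\, |x| \ge x_0.$$

Then the set

$$C = \left\{ \chi : \chi'(x) \le \chi_0'(x),\ \forall\, x \ge 0 \ \text{ and } \ h_\chi(s, t) = -\frac{1}{s}\, E_{F_0}\!\left[ \chi'\!\left(\frac{X - t}{s}\right) \frac{X - t}{s} \right] \right\}$$

satisfies the assumptions of Theorem 1.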

3.  GENERALIZED  BIAS 


Although the M-estimates of scale with general location introduced in Section
2 are Fisher consistent at the nominal distribution $F_0$, they are in general
asymptotically biased for $F \ne F_0$. Furthermore, the "raw" asymptotic bias
$B_R[S(F)] = S(F) - s_0$ can be of two distinct kinds: when $F$ is an
outlier-generating distribution, the bias $B_R[S(F)]$ is positive, and when $F$ is an
inlier-generating distribution, the bias $B_R[S(F)]$ is negative.

As in Martin and Zamar (1989), we consider generalized bias functions which
are scale invariant and flexible. Penalization of positive and negative bias is
independently chosen, by allowing the user to put positive and negative bias on
different scales. Specifically, we define the generalized bias

$$B[S(F)] = \begin{cases} L_1[S(F)/s_0] & \text{if } 0 < S(F) \le s_0 \\ L_2[S(F)/s_0] & \text{if } s_0 < S(F) < \infty, \end{cases} \tag{9}$$

where $L_1$ and $L_2$ are continuous, non-negative and monotone, with
$L_1(1) = L_2(1) = 0$ and

$$\lim_{t \to 0} L_1(t) = \lim_{t \to \infty} L_2(t) = \infty.$$
We are interested in the maximum generalized bias,

$$B(\varepsilon) = \max_{F \in \mathcal{F}_\varepsilon} B[S(F)]. \tag{10}$$

From monotonicity of $L_1$ and $L_2$, it follows that
$B(\varepsilon) = \max\{ L_1[S^-/s_0],\ L_2[S^+/s_0] \}$,
where $S^-$ and $S^+$ denote the infimum and the supremum of the functional $S(F)$ as
$F$ ranges over $\mathcal{F}_\varepsilon$.
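
The generalized bias (9) and its maximum over the neighborhood can be sketched in a few lines of Python. The example below uses the logarithmic loss $L_2(t) = -L_1(t) = \log t$ that the paper adopts later for its tables; the function names and the default $s_0 = 1$ are illustrative assumptions.

```python
from math import log


def log_loss_lower(t):
    """L1(t) = -log(t): penalty for inlier-type bias, S(F) <= s0."""
    return -log(t)


def log_loss_upper(t):
    """L2(t) = log(t): penalty for outlier-type bias, S(F) > s0."""
    return log(t)


def generalized_bias(s, s0=1.0, L1=log_loss_lower, L2=log_loss_upper):
    """Generalized bias (9) of an asymptotic scale value s relative to s0."""
    ratio = s / s0
    return L1(ratio) if ratio <= 1.0 else L2(ratio)


def max_generalized_bias(s_minus, s_plus, s0=1.0,
                         L1=log_loss_lower, L2=log_loss_upper):
    """Maximum generalized bias from the extreme values S^- <= s0 <= S^+."""
    return max(L1(s_minus / s0), L2(s_plus / s0))


# Halving and doubling the scale are penalized equally under logarithmic loss.
print(generalized_bias(0.5), generalized_bias(2.0))
```

Under this loss an implosion of the scale toward zero is penalized as heavily as an explosion toward infinity, which is why both $S^-$ and $S^+$ matter in the minimax problem.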

4.  BIAS  ROBUST  HUBER  ESTIMATES 

In view of the historical importance and high degree of familiarity of Huber
(Proposal 2) estimates we first focus on obtaining bias robust estimates in this
class. To emphasize the dependence on $\chi$ and $\psi$ we use the notation
$S(F, \chi, \psi)$, $S^+(\chi, \psi)$, etc.

The first step toward finding the bias robust Huber estimate is deriving the
expressions (16) and (17) for $S^-(\chi, \psi)$ and $S^+(\chi, \psi)$. Claims which
are made below without proof can be easily verified under (A0)-(A3).

The maximum value $S^+(\chi, \psi)$ of the scale functional $S(F, \chi, \psi)$ is
produced by a point mass contamination at infinity, $\delta_\infty$, and such
contamination also produces the maximum value of the location estimate
$T(F, \chi, \psi)$. The estimating equations in this limit case are

$$(1 - \varepsilon)\, E_{F_0}\,\chi[(X - t)/s] + \varepsilon = b(\chi) \tag{11}$$

and

$$(1 - \varepsilon)\, E_{F_0}\,\psi[(X - t)/s] + \varepsilon = 0. \tag{12}$$

Let $\gamma_\chi(t)$ be the unique solution of (11) for fixed $t$ and let
$\tau_\psi(s)$ be the unique solution of (12) for fixed $s > 0$. The function
$m_{\chi,\psi}(t) = \tau_\psi[\gamma_\chi(t)]$ is continuous and non-decreasing.
Also, the pair $(s^*, t^*)$ simultaneously satisfies (11) and (12) if and only if
$t^* = m_{\chi,\psi}(t^*)$ and $s^* = \gamma_\chi(t^*)$.

The following lemma characterizes the maximum asymptotic bias due to outliers
of the location and scale Huber M-estimates. This lemma also provides an algorithm
for computing these maximum biases. We recall that Huber (1964) has shown that
the maximum asymptotic bias of the median (the minimax-bias estimate of location)
is

$$t_0 = F_0^{-1}[.5/(1 - \varepsilon)]. \tag{13}$$

LEMMA 2. Suppose that (A0)-(A3) hold. For each $n \ge 1$ let
$t_n = m_{\chi,\psi}(t_{n-1})$, with $t_0$ given by (13). Let $s_n = \gamma_\chi(t_n)$
and $t^* = \inf\{ t \ge t_0 : m_{\chi,\psi}(t) = t \}$. Then, (a)
$\lim_{n \to \infty} t_n = t^*$ and $\lim_{n \to \infty} s_n = \gamma_\chi(t^*) = s^*$;
(b) the maximum asymptotic bias of the location estimate $T(F, \chi, \psi)$ is $t^*$
and the maximum value of the scale functional $S(F, \chi, \psi)$ is
$S^+(\chi, \psi) = s^*$.

The minimum value of the scale functional $S(F, \chi, \psi)$, $S^-(\chi, \psi)$, is
produced by a point mass contamination at zero. In this case the estimating
equations are

$$(1 - \varepsilon)\, E_{F_0}\,\chi[(X - t)/s] + \varepsilon\,\chi(t/s) = b(\chi) \tag{14}$$

and

$$(1 - \varepsilon)\, E_{F_0}\,\psi[(X - t)/s] + \varepsilon\,\psi(-t/s) = 0. \tag{15}$$

By monotonicity of $\psi$, $t = 0$ for all $s > 0$. Let $g_{\chi,t}^{-1}$ be the
inverse of $g_\chi(s, t)$ with respect to $s$, for fixed $t$. Then, from (14) with
$t = 0$ it follows that

$$S^-(\chi, \psi) = g_{\chi,0}^{-1}[\,b(\chi)/(1 - \varepsilon)\,]. \tag{16}$$
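
For the Madm at a standard normal $F_0$ (jump function $\chi_a$ with $a = F_0^{-1}(3/4)$, so $b(\chi_a) = 1/2$, and the sign function $\psi_0$) the equations above can be solved numerically. With the sign function, (12) reduces to $F_0(t) = .5/(1 - \varepsilon)$ for every $s$, so the location bias is $t_0$ of (13) and $S^+$ follows from (11) at $t = t_0$; $S^-$ follows from (16). The Python sketch below is an illustration under these assumptions, with logarithmic loss; the function names and the bisection bracket are choices made here, not part of the paper.

```python
from math import log
from statistics import NormalDist

F0 = NormalDist()          # nominal model, assumed standard normal
A = F0.inv_cdf(0.75)       # jump point a of chi_a for the Madm; b(chi_a) = 1/2


def madm_s_plus(eps):
    """Largest asymptotic value S^+: solve (11) at t = t0 given by (13)."""
    t0 = F0.inv_cdf(0.5 / (1.0 - eps))
    target = 0.5 / (1.0 - eps)      # required P(|X - t0| < a*s) under F0
    lo, hi = 0.0, 50.0              # bracket for c = a*s
    for _ in range(200):            # bisection: the coverage increases with c
        mid = 0.5 * (lo + hi)
        if F0.cdf(t0 + mid) - F0.cdf(t0 - mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) / A


def madm_s_minus(eps):
    """Smallest asymptotic value S^- from (16): 2[1 - F0(a*s)] = b/(1 - eps)."""
    return F0.inv_cdf(1.0 - 0.25 / (1.0 - eps)) / A


def madm_max_log_bias(eps):
    """Maximum generalized bias with L2(t) = -L1(t) = log(t)."""
    return max(-log(madm_s_minus(eps)), log(madm_s_plus(eps)))


for eps in (0.30, 0.40):
    print(eps, round(madm_max_log_bias(eps), 3))
```

For $\varepsilon = 0.30$ and $0.40$ the values produced by this sketch are close to the maximum biases 0.612 and 1.166 reported for the Madm in Table 1.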


Optimal  Centering 

The choice of $\psi$ has an effect on the maximum asymptotic bias of the scale
estimate by virtue of affecting the bias $t^*$ of the location estimate. Observe that
since $S^-(\chi, \psi)$ doesn't depend on $\psi$ (see (16)), the optimal choice of
$\psi$ must be based on $S^+(\chi, \psi)$ alone.

It follows from Lemma 2 and (11), with $t = t^*$, that

$$S^+(\chi, \psi) = g_{\chi,t^*}^{-1}[\,(b(\chi) - \varepsilon)/(1 - \varepsilon)\,]. \tag{17}$$

For all $0 < \alpha < 1$ the function $g_{\chi,t}^{-1}(\alpha)$ is non-decreasing in
$t$. Moreover, by the Huber (1964) minimax-bias result,
$\tau_\psi(s) \ge t_0 = \tau_{\psi_0}(s)$, for all $s > 0$ and $\psi$, where
$\psi_0$ is the "sign" function (6). Thus we have the following result:

THEOREM 2. For each fixed $\chi$ satisfying (A1), the median centering functional
minimizes the maximum asymptotic bias of both location and scale among Huber
estimates with $\psi$ satisfying (A3).
More generally, it is not difficult to show that Theorem 2 holds for the class of
all M-estimates of scale with centering functional $T(F)$ having the "monotonicity"
property

$$T(F) \le T[(1 - \varepsilon) F_0 + \varepsilon\,\delta_\infty], \quad \forall\, F \in \mathcal{F}_\varepsilon. \tag{18}$$


The  Minimax-Bias  Huber  Estimate  of  Scale 

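By Theorem 2 it suffices to consider $S^+(\chi, \psi_0)$ and $S^-(\chi, \psi_0)$, and
the function $\psi$ can be dropped from the notation. It will be shown that under
certain conditions the maximum generalized bias $B(\chi)$ (see (10)) is minimized by
a jump function $\chi_{a^*}$ (see (5)).

For each $a > 0$, let $B(a) = B(\chi_a)$ and

$$b(a) = b(\chi_a) = 2[1 - F_0(a)]. \tag{19}$$

We begin by showing that given $0 < \varepsilon < .5$, $F_0$, $L_1$ and $L_2$ there
exists $a^*$ such that

$$B(a^*) \le B(a), \quad \forall\, a > 0. \tag{20}$$

Let $a_0 = F_0^{-1}[(1 + \varepsilon)/2]$ and $a_1 = F_0^{-1}[(2 - \varepsilon)/2]$.
From (19), the corresponding values of $b$ are $b_0 = b(a_0) = 1 - \varepsilon$ and
$b_1 = b(a_1) = \varepsilon$. Hence, letting $S^-(a) = S^-(\chi_a)$ and
$S^+(a) = S^+(\chi_a)$, we have

$$\lim_{a \to a_0} S^-(a) = \lim_{a \to a_0} g_{\chi_a,0}^{-1}\!\left[\frac{b(a)}{1 - \varepsilon}\right] = \lim_{a \to a_0} \frac{1}{a}\, F_0^{-1}\!\left[1 - \frac{b(a)}{2(1 - \varepsilon)}\right] = \frac{1}{a_0}\, F_0^{-1}(.5) = 0 \tag{21}$$

and

$$\lim_{a \to a_1} S^+(a) = \lim_{a \to a_1} g_{\chi_a,t_0}^{-1}\!\left[\frac{b(a) - \varepsilon}{1 - \varepsilon}\right] \ge \lim_{a \to a_1} g_{\chi_a,0}^{-1}\!\left[\frac{b(a) - \varepsilon}{1 - \varepsilon}\right] = \lim_{a \to a_1} \frac{1}{a}\, F_0^{-1}\!\left[1 - \frac{b(a) - \varepsilon}{2(1 - \varepsilon)}\right] = \frac{1}{a_1}\, F_0^{-1}(1) = +\infty. \tag{22}$$

Therefore, by the assumptions on $L_1$ and $L_2$, $B(a) \to +\infty$ when either
$a \to a_0$ or $a \to a_1$. By continuity of $B(a)$ there exists
$a_0 < a^* < a_1$ such that

$$B(a^*) \le B(a), \quad \forall\, a_0 < a < a_1. \tag{23}$$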
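
The endpoints $a_0$ and $a_1$ of the search interval for $a^*$ are easy to evaluate from (19). The Python sketch below does so for a standard normal $F_0$ (an assumption made here for illustration; the function names are likewise hypothetical) and checks numerically that $b(a_0) = 1 - \varepsilon$ and $b(a_1) = \varepsilon$.

```python
from statistics import NormalDist

F0 = NormalDist()  # nominal model, assumed standard normal


def b_of_a(a):
    """b(a) = 2[1 - F0(a)], equation (19)."""
    return 2.0 * (1.0 - F0.cdf(a))


def a_bounds(eps):
    """Endpoints a0 < a1 of the interval that contains the minimax jump point."""
    a0 = F0.inv_cdf((1.0 + eps) / 2.0)
    a1 = F0.inv_cdf((2.0 - eps) / 2.0)
    return a0, a1


a0, a1 = a_bounds(0.10)
print(round(a0, 3), round(a1, 3), round(b_of_a(a0), 3), round(b_of_a(a1), 3))
```

For $\varepsilon = 0.10$ the interval is roughly $(0.126, 1.645)$, and as $\varepsilon \to .5$ both endpoints collapse to $F_0^{-1}(.75) \approx 0.674$, the jump point of the Madm.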
Thus, the jump function $\chi_{a^*}$ is bias robust among all jump functions
$\chi_a$. The following theorem gives conditions under which
$S(F, \chi_{a^*}, \psi_0)$ is bias robust among all Huber estimates of scale.

THEOREM 3. Let $s^* = S^+(a^*)$, where $a^*$ is given by (23), and let $t_0$ be as in
(13). Suppose that in addition to (A0)-(A1), the following conditions hold:

(1) $f_0(x) > 0$, $\forall\, x \in R$, and $f_0(sx)/f_0(x)$ is increasing in $|x|$,
$\forall\, 0 < s < 1$.

(2) The function $k_0(x) = [f_0(s^* x - t_0) + f_0(s^* x + t_0)]/f_0(x)$ is decreasing
in $|x|$.

(3) $S^-(a)$ and $S^+(a)$ are both strictly monotone at $a = a^*$.

Then $B(a^*) \le B(\chi, \psi)$ for all pairs $(\chi, \psi)$ satisfying (A1)-(A3).

It can be shown that the conditions of Theorem 3 hold, for example, when $F_0$ is
the standard normal distribution and $\varepsilon < .35$ (see Martin and Zamar, 1987).

Near  Optimality  of  the  Madm 

Let $b^*$ be the value of $b(\chi_{a^*}) = E_{F_0}\,\chi_{a^*}(X)$. Since the bias
robust estimate of Theorem 3 is based on $\chi_{a^*}$, using the median for
centering, it follows that the bias robust Huber estimate is the $n - [n b^*]$ order
statistic of the absolute value of the residuals about the median (scaled by
$1/a^*$), where $a_0(\varepsilon) < a^* < a_1(\varepsilon)$. Since both
$a_0(\varepsilon)$ and $a_1(\varepsilon)$ tend to $F_0^{-1}(.75)$ as
$\varepsilon \to .5$, so does $a^*$. Thus, as $\varepsilon \to .5$, the bias robust
Huber estimate is the well known Madm, whose breakdown-point is equal to .5.

It came as a pleasant surprise that for a broad range of $\varepsilon$, the maximum
bias of the bias robust estimate is very close to that of the Madm for the leading
case of the nominal Gaussian distribution and the logarithmic loss function
$L_2(t) = -L_1(t) = \log(t)$. Table 1 shows the values of $a^*$, $b^* = b(a^*)$, the
minimax bias $B(a^*)$ and the maximum bias $B(Madm)$ of the Madm for some values of
$\varepsilon$. The value of $a$ for the Madm is .674. Therefore in this case there
is no appreciable difference between the Madm and the bias robust estimates. Note
in particular that even when we choose $\varepsilon$ small, e.g.
$\varepsilon = .05$, the breakdown-point of the minimax-bias scale estimate is very
close to .5.

   ε       a*      b(a*)    B(a*)    B(Madm)
  0.05    0.650    0.516    0.062
  0.10    0.674    0.500    0.135
  0.15    0.673    0.501    0.221
  0.20    0.673    0.501    0.324
  0.30    0.676    0.499    0.609    0.612
  0.40    0.695    0.487    1.072    1.166
  0.45    0.713    0.476    1.440    1.779

Table 1. Bias-robust Huber proposal 2 estimates of scale when $F_0$ = standard
normal. Logarithmic loss function.


5.  BIAS ROBUST S-ESTIMATES

One naturally wonders whether greater bias robustness can be obtained by
enlarging the class of estimates over which one searches for a minimax solution. In
particular one may consider the entire class of M-estimates of scale with general
location. This larger class of course includes joint M-estimates of location and scale
with redescending as well as monotone $\psi$ for the location estimate.

As a first step in dealing with this problem, we show that it suffices to restrict
attention to the smaller class of S-estimates of scale.

The following notation is needed for stating Theorem 4. Let $S^+(\chi)$ and
$S^-(\chi)$ denote the maximum and minimum asymptotic values of the S-estimate of
scale based on $\chi$ (see (7)). Let $S^+(\chi, T)$ and $S^-(\chi, T)$ denote the
maximum and minimum asymptotic values of the M-estimate of scale $S[F, T(F)]$ with
general location, based on $\chi$ and the location estimate $T(F)$.

THEOREM 4. Suppose that (A0)-(A2) hold and let $\gamma_\chi(s) = g_\chi(s, 0)$, where
$g_\chi$ is given by (8). Let $T$ be any location-scale equivariant estimate
satisfying the condition

$$T[(1 - \varepsilon) F_0 + \varepsilon\,\delta_0] = 0.$$

Then,

(a) $S^+(\chi) = \gamma_\chi^{-1}[\,(b(\chi) - \varepsilon)/(1 - \varepsilon)\,] \le S^+(\chi, T)$

(b) $S^-(\chi) = \gamma_\chi^{-1}[\,b(\chi)/(1 - \varepsilon)\,] = S^-(\chi, T)$.


This  paves  the  way  for  the  following  main  result. 
THEOREM 5. Suppose that (A0)-(A2) hold. Then there exists a jump function
$\chi_{a^*}$ such that the S-estimate based on $\chi_{a^*}$ has the minimax asymptotic
bias over the class of all M-estimates of scale with general location.


Therefore, the minimax estimate is an S-estimate based on the jump function
$\chi_{a^*}$.


Near  Optimality  of  the  Shorth 

As in Section 4, $a^* \to F_0^{-1}(.75)$ as $\varepsilon \to .5$. Thus the minimax
estimate of scale with general location tends to the Shorth as $\varepsilon \to .5$.
Table 2 shows the values of $a^*$, $b^* = b(a^*)$, the minimax bias $B(a^*)$ and the
maximum bias $B(Shorth)$ of the Shorth for some values of $\varepsilon$. These
results show that the minimax estimate is reasonably well approximated by the Shorth
in terms of maximum bias, the approximation being less good for larger values of
$\varepsilon$. One again finds that a breakdown point reasonably close to .5 is
obtained by the minimax estimate for a wide range of values of $\varepsilon$.


   ε       a*      b(a*)    B(a*)    B(Shorth)
  0.05    0.650    0.516    0.060    0.060
  0.10    0.700    0.484    0.127    0.135
  0.15    0.716    0.474    0.201    0.220
  0.20    0.726    0.468    0.284    0.322
  0.30    0.751    0.453    0.495    0.612
  0.40    0.763    0.445    0.845    1.166
  0.45    0.746    0.456    1.236    1.779

Table 2. Bias-robust M-estimates of scale with general location when $F_0$ =
standard normal. Logarithmic loss function.
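
The closed forms in Theorem 4 make the maximum bias of the Shorth straightforward to evaluate. For the jump function $\chi_a$ and a standard normal $F_0$, $\gamma_\chi(s) = P(|X| > as) = 2[1 - F_0(as)]$, which inverts explicitly. The Python sketch below is an illustration under these assumptions, with $a = F_0^{-1}(3/4)$, $b = 1/2$ and logarithmic loss; the function names are hypothetical.

```python
from math import log
from statistics import NormalDist

F0 = NormalDist()        # nominal model, assumed standard normal
A = F0.inv_cdf(0.75)     # jump point of chi_a for the Shorth
B = 0.5                  # b(chi_a) = 2[1 - F0(a)] at a = F0^{-1}(3/4)


def gamma_inverse(alpha, a=A):
    """Solve gamma(s) = 2[1 - F0(a*s)] = alpha for s."""
    return F0.inv_cdf(1.0 - alpha / 2.0) / a


def shorth_s_plus(eps):
    """Theorem 4(a): largest asymptotic value of the S-estimate."""
    return gamma_inverse((B - eps) / (1.0 - eps))


def shorth_s_minus(eps):
    """Theorem 4(b): smallest asymptotic value of the S-estimate."""
    return gamma_inverse(B / (1.0 - eps))


def shorth_max_log_bias(eps):
    """Maximum generalized bias with L2(t) = -L1(t) = log(t)."""
    return max(-log(shorth_s_minus(eps)), log(shorth_s_plus(eps)))


for eps in (0.10, 0.20, 0.30):
    print(eps, round(shorth_max_log_bias(eps), 3))
```

For $\varepsilon = 0.10$ and $0.20$ the sketch agrees closely with the values 0.135 and 0.322 in the $B(Shorth)$ column of Table 2. In these cases the maximum is attained by the inlier term $-\log S^-$, which the Shorth shares with the Madm.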


It should be remarked that the S-estimate of location associated with the Shorth,
namely the midpoint of the shortest half of the data, has a slow rate of convergence
(Andrews et al., 1972). However, the Shorth estimate of scale has the usual rate of
convergence (Grubel, 1988).
6.  HUBER ESTIMATES VERSUS S-ESTIMATES OF SCALE: MADM VERSUS SHORTH


The class of Huber estimates of scale considered in Section 4 excludes centering
functionals which are M-estimates of location with redescending $\psi$. These
location estimates are of course allowed in the larger class considered in Section 5.
We now show that the S-estimate of location $T(F)$ is in fact an M-estimate of
location with redescending psi-function $\psi(x) = \chi'(x)$. Let
$t^* = \arg\min_t S(F, t)$. The monotonicity of $\chi(x)$ on $[0, \infty)$ and the
definition of the S-estimate of scale $S(F)$ (see (7)) imply that

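$$E_F\,\chi\!\left[\frac{X - t}{S(F)}\right] \ge E_F\,\chi\!\left[\frac{X - t}{S(F, t)}\right] = b(\chi) = E_F\,\chi\!\left[\frac{X - t^*}{S(F)}\right], \quad \forall\, t.$$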

So, $t^*$ minimizes the function $l(t) = E_F\,\chi[(X - t)/S(F)]$ and therefore
satisfies the equation $l'(t^*) = 0$, that is, satisfies the location M-estimate
equation

$$l'(t) = E_F\,\chi'[(X - t)/S(F)] = 0.$$

Figures 2a and 2b display the maximum bias curves of the minimax Huber and
S-estimates of scale (for the case of logarithmic loss function) for outliers and
inliers, respectively. The logarithmic biases for the Madm and the Shorth are also
shown. Figure 2 reveals uniformly smaller bias for the minimax S-estimate than for
the minimax Huber estimate.

We notice that in Figure 2a the maximum bias curve for the Shorth is uniformly
smaller than that of the minimax S-estimate, whereas the opposite is true in Figure
2b. This is a consequence of the relative way in which the logarithmic loss function
penalizes positive and negative bias. It is worth noticing that if one is concerned
only about outliers, then the Shorth is the best choice with respect to bias. The
better performance of S-estimates relative to the Huber M-estimates in the case of
outliers is a consequence of the S-estimate of location being an M-estimate with
redescending $\psi$, which suffers no bias for gross outliers.

Also, referring back to Figure 1, we would remark that the price paid for using a
high efficiency Huber estimate is in terms of maximum bias and breakdown point.
Table 3 presents mean-squared-error relative efficiencies of the Shorth relative to
the Madm for finite sample sizes n = 20, 40, 100, computed by Monte Carlo
simulation. These results indicate considerable superiority of the Shorth for
outliers, and moderate superiority of the Madm for inliers.
              n = 20               n = 40               n = 100
   ε     Outliers  Inliers    Outliers  Inliers    Outliers  Inliers
   …     1.09  1.10  1.10  1.18  1.18  1.11  1.11  1.06  1.12  1.06
   …     1.07  1.00  1.02  1.07  …
  0.15     1.10     0.98        1.04     0.91        1.12     0.83
  0.20     1.10     0.89        1.14     0.82        1.27     0.77
  0.25     1.23     0.85        1.27     0.76        1.39     0.75
  0.30     1.43     0.83        1.46     0.77        1.61     0.78
  0.35     1.62     0.82        1.66     0.77        1.83     0.77
  0.40     1.74     0.85        1.86     0.80        2.01     0.78
  0.45     1.87     0.92        2.00     0.85        2.16     0.82

Table 3. Mean-squared-error relative efficiencies of SHORTH and MADM.


7.  FINITE SAMPLE RELEVANCE OF ASYMPTOTIC BIAS ROBUSTNESS

Unfortunately, the functions $\chi_a$ are discontinuous and so Theorem 1 cannot be
invoked to claim finite sample relevance for the asymptotic minimax theory. However,
we can prove the following result, which is relevant to the finite sample size
situation.


THEOREM 6. Let $0 < a < \infty$. For each $\Delta > 0$,

$$\lim_{m \to \infty} \sup_{F \in \mathcal{F}_\varepsilon} P_F \left\{ S^-(a) - \Delta < S_{\chi_a}(F_n) < S^+(a) + \Delta,\ \forall\, n \ge m \right\} = 1.$$


So $S^-(a) - \Delta$ and $S^+(a) + \Delta$ are almost sure uniform lower and upper
bounds for the minimum and the maximum values of the S-estimate of scale
$S_{\chi_a}(F_n)$, for $m$ large enough.

The Monte Carlo results summarized in Figures 3 and 4 suggest that the required values of $m$ are moderately small. These figures display the finite sample bias (logarithmic loss) for several contamination models for the Shorth and for the Madm, as well as the corresponding maximum bias curves. Observe that in both cases, for outliers and for inliers, the asymptotic maximum bias curves tend to be rather close to the finite sample bias curves.
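To make the object of Theorem 6 concrete, the following Python sketch computes an S-estimate of scale for a jump score function. It rests on assumptions that are not restated in this section: that $\chi_a(x) = 1$ when $|x| \ge a$ and $0$ otherwise, that $S(F_n,t) = \sup\{s : n^{-1}\sum_i \chi_a[(x_i - t)/s] \ge b\}$, and that $S(F_n) = \inf_t S(F_n,t)$. The function names and the sample values are hypothetical.

```python
import math
from statistics import NormalDist

def s_scale_fixed_location(x, t, a, b):
    """sup{s : mean(|x_i - t| >= a*s) >= b} for the assumed jump function chi_a."""
    n = len(x)
    m = math.ceil(n * b)                 # at least m points must satisfy |x_i - t| >= a*s
    d = sorted(abs(xi - t) for xi in x)
    return d[n - m] / a                  # m-th largest absolute deviation, rescaled by a

def s_scale(x, a, b):
    """inf over t of s_scale_fixed_location, computed from the shortest window of k points."""
    n = len(x)
    m = math.ceil(n * b)
    k = n - m + 1                        # points that must lie within distance a*s of t
    xs = sorted(x)
    half = min(xs[i + k - 1] - xs[i] for i in range(n - k + 1)) / 2.0
    return half / a

# Usage sketch: b = 1/2 and a = F0^{-1}(1 - b/2), with F0 taken to be standard Gaussian.
b = 0.5
a = NormalDist().inv_cdf(1.0 - b / 2.0)
sample = [-1.3, -0.4, 0.1, 0.2, 0.9, 1.1, 7.5]
estimate = s_scale(sample, a, b)
```

Under these assumptions and with $b = 1/2$, the estimate is a rescaled half-length of the shortest interval covering roughly half of the sample, which is the sense in which a jump score function leads to a Shorth-type estimate.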

8.  FINITE  SAMPLE  COMPARISON  WITH  OTHER  ESTIMATES 

A Monte Carlo simulation study was carried out to compare the bias and mean-squared-error (MSE) performance of the following scale estimates: the minimax-bias scale estimates, the Madm, the Shorth, the A-estimate of scale discussed by Lax (1985), and the rejection-plus-standard-deviation (with $\alpha = .01$) discussed by Simonoff (1987). Some results for sample size $n = 20$ are presented in Figure 5 (MSE) and Figure 6 (bias), for the case $F_0 = N(0,1)$ and logarithmic loss. Each simulated sample contains exactly $20\epsilon$ outliers generated from the four different distributions indicated at the tops of the figures. Similar results (not presented here) were obtained for $n = 40$, $n = 100$ and for other types of contaminating distributions. The main conclusions are: (1) when $\epsilon < .10$ the four estimates perform equally well; (2) for larger fractions of outliers the Shorth and the Madm usually outperform the other two estimates, with the Shorth being somewhat better; and (3) when the outliers are large and well separated from the rest of the data, e.g., generated from a $N(10,1)$, the rejection-plus-standard-deviation estimate performs better than the other three estimates.
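A simulation of this kind can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code used for the study: the Gaussian scaling constant 0.6745, the choice of $\lfloor n/2 \rfloor + 1$ points for the shortest half, the reading of logarithmic loss as $(\log \hat{s})^2$ with true scale 1, and all function names are assumptions.

```python
import math
import random
import statistics

def madm(x):
    """Median of absolute deviations about the median, with assumed Gaussian scaling."""
    med = statistics.median(x)
    return statistics.median(abs(xi - med) for xi in x) / 0.6745

def shorth(x):
    """Half-length of the shortest interval covering n//2 + 1 points, same assumed scaling."""
    xs = sorted(x)
    n = len(xs)
    k = n // 2 + 1
    half = min(xs[i + k - 1] - xs[i] for i in range(n - k + 1)) / 2.0
    return half / 0.6745

def contaminated_sample(n, eps, mu, rng):
    """Exactly round(eps * n) outliers from N(mu, 1); the remaining points from N(0, 1)."""
    n_out = round(eps * n)
    outliers = [rng.gauss(mu, 1.0) for _ in range(n_out)]
    clean = [rng.gauss(0.0, 1.0) for _ in range(n - n_out)]
    return outliers + clean

def log_mse(estimator, n, eps, mu, reps, seed=0):
    """Monte Carlo average of (log estimate)^2; the true scale is 1, so the target is 0."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        total += math.log(estimator(contaminated_sample(n, eps, mu, rng))) ** 2
    return total / reps
```

For example, `log_mse(madm, 20, 0.2, 10.0, 2000) / log_mse(shorth, 20, 0.2, 10.0, 2000)` gives one Monte Carlo ratio of the two mean squared errors for $n = 20$ and outliers from $N(10,1)$; both calls use the same seed, so the two estimates see the same samples.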

9.  THE  GES  APPROXIMATION 

Hampel et al. (1986) established that, based on the gross-error sensitivity, the Madm is the most bias-robust M-estimate of scale for vanishingly small fractions of contamination $\epsilon$. In fact the Shorth has the same influence function and hence the same gross-error-sensitivity as the Madm, namely 0.787 (see Leroy and Rousseeuw, 1988). However, this leaves unanswered the question of optimality for each $\epsilon \in (0, .5)$, and our results show that the Shorth is a better estimate than the Madm from the global (i.e. $\epsilon > 0$) point of view.

On the other hand, it must be noted that the gross-error-sensitivity approximation is remarkably good for $\epsilon < .05$, with the approximation being better the more bias-robustness the estimate possesses. This provides substantial reconfirmation of the utility of the influence curve and the gross-error-sensitivity as a measure of maximal bias.

At the same time one should be aware that the gross-error-sensitivity linear approximation may be less accurate for problems with nuisance parameters. For example, in the present context, the GES approximation to the maximal bias curve of the Madm does not reflect the impact of the bias of estimation of the nuisance location parameter. Since the maximum asymptotic bias of the Shorth is unaffected by the asymptotic bias of the location estimate, the GES approximation is better in this case.

10.  PROOFS  OF  LEMMAS  AND  THEOREMS 

The  following  lemma  is  needed  to  prove  Theorem  1. 

LEMMA 3. Let $0 < s_1 < s_2 < \infty$ be as in Lemma 1. Suppose that (A-0)-(A-2) hold and also assume that $\chi$ and $h_\chi(s,t)$ are continuous and $h_\chi(s,t) < 0$, $\forall\, s > 0$, $t \in R$. Then, for all $K > 0$, we have:

(a) $\chi[(x - t)/s]$ is uniformly continuous on $(x,s,t) \in R \times [s_1,s_2] \times [-K,K]$.

(b) $S(F,t)$ is uniformly continuous on $t \in R$, uniformly on $F \in \mathcal{F}_\epsilon$.

(c) $\chi[(x - t)/S(F,t)]$ is uniformly continuous on $(x,t) \in R \times [-K,K]$, uniformly on $F \in \mathcal{F}_\epsilon$.


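Proof. Let $\delta > 0$ be fixed and let $B = [s_1,s_2] \times [-K,K]$. Since $\chi(x)$ is continuous, even, monotone on $[0,\infty)$ and bounded, it is uniformly continuous. Let $\Delta_1 > 0$ be such that $|x - x'| < \Delta_1$ implies $|\chi(x) - \chi(x')| < \delta$. Also, since $\lim_{|x| \to \infty} \chi(x) = 1$, there exists $x_0 > 0$ such that $|x| > x_0$ and $|x'| > x_0$ imply $|\chi(x) - \chi(x')| < \delta$. Let $x_1 > 0$ be such that if $|x| > x_1$, then $|(x - t)/s| > x_0$ for all $(s,t) \in B$. So $\chi[(x - t)/s]$ is uniformly continuous on $\{x : |x| > x_1\} \times B$. If $|x| \le x_1$ then

\[ \left| \frac{x - t}{s} - \frac{x - t'}{s'} \right| \le x_1 \left| \frac{1}{s} - \frac{1}{s'} \right| + \left| \frac{t}{s} - \frac{t'}{s'} \right| , \]

and (a) follows. To prove (b) notice that the assumptions on $h_\chi(s,t)$ imply that $\min_{(s,t) \in B} |h_\chi(s,t)| > 0$, and so $\delta_0 = \delta(1 - \epsilon)\min_{(s,t) \in B} |h_\chi(s,t)| > 0$. By (a) there exists $\Delta > 0$ such that $|t - t'| < \Delta$, $|t| \le K$, $|t'| \le K$ imply

\[ \left| E_F\chi\left[\frac{X - t'}{S(F,t) - \delta}\right] - E_F\chi\left[\frac{X - t}{S(F,t) - \delta}\right] \right| < \frac{\delta_0}{4} , \]

and so, using the Mean Value Theorem and $E_F\chi[(X - t)/s] \ge (1 - \epsilon)E_{F_0}\chi[(X - t)/s]$, $\forall\, F \in \mathcal{F}_\epsilon$,

\[ E_F\chi\left[\frac{X - t'}{S(F,t) - \delta}\right] - b(\chi) \ge E_F\chi\left[\frac{X - t}{S(F,t) - \delta}\right] - E_F\chi\left[\frac{X - t}{S(F,t)}\right] - \frac{\delta_0}{4} \]
\[ \ge (1 - \epsilon)\left\{ E_{F_0}\chi\left[\frac{X - t}{S(F,t) - \delta}\right] - E_{F_0}\chi\left[\frac{X - t}{S(F,t)}\right] \right\} - \frac{\delta_0}{4} \]
\[ \ge \delta(1 - \epsilon)\min_{(s,t) \in B} |h_\chi(s,t)| - \frac{\delta_0}{4} > 0. \]

Thus, $S(F,t') \ge S(F,t) - \delta$ and, interchanging the roles of $t$ and $t'$, $S(F,t) \ge S(F,t') - \delta$, so (b) holds. Finally, (c) follows directly from (a) and (b) □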


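Proof of Theorem 1. Let $\delta > 0$ be fixed. It can be shown, as in the proof of Lemma 3 (b), that there exists $0 < \delta_0 < 1$ such that

\[ E_F\chi\left[\frac{X - t}{S(F,t) - \delta}\right] - b(\chi) \ge \delta_0 , \quad \forall\, |t| \le K , \ \forall\, F \in \mathcal{F}_\epsilon . \tag{24} \]

For all $\gamma > 0$, let $B_n(t,\gamma) = \left\{ \frac{1}{n}\sum_{i=1}^n \chi\left[\frac{x_i - t}{S(F,t) - \delta}\right] - b(\chi) > \gamma \right\}$. By Lemma 3 (c) there exist $-K = t_1 < t_2 < \ldots < t_m = K$ such that

\[ \bigcap_{|t| \le K} B_n(t,0) \supseteq \bigcap_{j=1}^m B_n(t_j, \delta_0/2) , \tag{25} \]

for all $n$. By (24) and Bernstein's inequality, for each $j = 1, \ldots, m$ and for all $F \in \mathcal{F}_\epsilon$,

\[ P_F\{B_n^c(t_j, \delta_0/2)\} \le P_F\left\{ \frac{1}{n}\sum_{i=1}^n \chi\left[\frac{x_i - t_j}{S(F,t_j) - \delta}\right] - E_F\chi\left[\frac{X - t_j}{S(F,t_j) - \delta}\right] \le -\frac{\delta_0}{2} \right\} \le e^{-n\delta_0^2/12} . \tag{26} \]

By (25) and (26),

\[ P_F\left\{ \inf_{|t| \le K} [S(F_n,t) - S(F,t)] \ge -\delta \right\} \ge P_F\left\{ \bigcap_{|t| \le K} B_n(t,0) \right\} \ge 1 - \sum_{j=1}^m P_F\{B_n^c(t_j, \delta_0/2)\} \ge 1 - m e^{-n\delta_0^2/12} , \]

for all $F \in \mathcal{F}_\epsilon$. Analogously, we can show that

\[ P_F\left\{ \sup_{|t| \le K} [S(F_n,t) - S(F,t)] \le \delta \right\} \ge 1 - m e^{-n\delta_0^2/12} , \quad \forall\, F \in \mathcal{F}_\epsilon . \]

Therefore,

\[ \sum_{n=1}^\infty P_F\left\{ \sup_{|t| \le K} |S(F_n,t) - S(F,t)| > \delta \right\} \le \frac{2m}{1 - e^{-\delta_0^2/12}} , \quad \forall\, F \in \mathcal{F}_\epsilon , \]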
and (a) follows by the Borel-Cantelli lemma. To prove (b) first notice that, since $b(\chi) < (1 - \epsilon)$, there exists $K_1 > 0$ such that for all $|t| > K_1$,

[1^1  ^  (^)  2  (^)  > 
where $s_1$ is as in Lemma 1. Notice that by the Dominated Convergence Theorem

\[ \lim_{t \to \infty} E_{F_0}\chi[(X - t)/s_1] = 1. \]

Hence, $S(F,t) > S(F,0)$, $\forall\, |t| > K_1$, $\forall\, F \in \mathcal{F}_\epsilon$, and so $S(F) = \inf_{t \in R} S(F,t) = \inf_{|t| \le K_1} S(F,t)$. On the other hand, let $K_3$ and $\delta_1 > 0$ be such that $(1 - \epsilon)P_{F_0}(|X| \le K_3) > b(\chi) + \delta_1$. Observe now that

lim  inf  x  (“ Tt)  <  ^3)  =  /(|x|  <  F3),  V  i  €  A, 

Aa— >00  |t|>/Ca  VSj  +<5/ 

where $I(|x| \le K_3) = 1$ if $|x| \le K_3$ and equal to zero otherwise. Hence, by the Dominated Convergence Theorem

lim  Ek,  (,  inf  X  (^)  /(Ia:|  <  a:,)}  =  (1  -  <  K,)  >  Hx)  + 

ffa— 00  \Sl+0/  J 

Therefore, there exists $K_2 > K_1$ such that

>  +  (27) 

Let $\delta_2 = \min\{\delta_0, \delta_1\}/2$. By (a), (27) and Bernstein's inequality,

Pf  (5(F„)  =  ,  inf  5(F„,o|  >  PfI,  jnf  S(Fn,t)  >  5(F„,0)|  > 
Therefore,

\[ P_F\{|S(F_n) - S(F)| \le \delta\} \ge P_F\left\{ \sup_{|t| \le K_2} |S(F_n,t) - S(F,t)| \le \delta \right\} + P_F\left\{ S(F_n) = \inf_{|t| \le K_2} S(F_n,t) \right\} - 1 \ge 1 - e^{-n\gamma} , \quad \forall\, F \in \mathcal{F}_\epsilon , \]

for some $\gamma > 0$ and (b) follows. Since

\[ P_F\{|S[F_n,T(F_n)] - S[F,T(F)]| > \delta\} \le P_F\left\{ \sup_{|t| \le 2K} |S(F_n,t) - S(F,t)| > \frac{\delta}{2} \right\} + P_F\left\{ |S[F,T(F_n)] - S[F,T(F)]| > \frac{\delta}{2} \right\} , \]

(c) follows from (a) and Lemma 3 (c).

Finally, (d) follows by noticing that, under the given assumptions, all the statements made in the proof of (a), (b) and (c) hold uniformly for all $\chi$ in $C$ □

Proof of Lemma 2. Since the median minimizes the maximum asymptotic bias among location equivariant estimates (Huber, 1964), and since $t_0$ and $t_1$ are the maximum asymptotic biases of the median and a location equivariant estimate, $t_0 \le t_1$. Thus, $t_1 = m(t_0) \le m(t_1) = t_2$ and in general, $t_n \le t_{n+1}$. Let $t^{**} = \lim_{n \to \infty} t_n$. Since $t^{**} = \lim_{n \to \infty} t_{n+1} = \lim_{n \to \infty} m(t_n) = m(\lim_{n \to \infty} t_n) = m(t^{**})$, we have $t^{**} \ge t^*$. On the other hand, if $t$ satisfies $t = m(t) \ge t_0$ then $t = m(t) \ge t_n$ for all $n$. Therefore $t \ge t^{**}$ and so $t^{**} \le t^*$. The second part of (a) follows directly from the continuity of $\gamma(t)$. To prove (b) observe that $t^*$ is a lower bound for the maximum bias of the Huber estimate of location. This lower bound is achieved if the estimate is computed by the recursion formula $t_{n+1} = m(t_n)$, starting from the median □
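The monotone recursion $t_{n+1} = m(t_n)$ used in this proof can be illustrated numerically. The Python sketch below uses a hypothetical map `m_example` that is continuous, increasing and satisfies $m(t_0) \ge t_0$; it is not the function $m$ of Lemma 2, and the starting value is arbitrary.

```python
import math

def iterate_to_fixed_point(m, t0, tol=1e-12, max_iter=10_000):
    """Iterate t_{n+1} = m(t_n) from t0; return the limit and the visited sequence."""
    seq = [t0]
    for _ in range(max_iter):
        t_next = m(seq[-1])
        seq.append(t_next)
        if abs(seq[-1] - seq[-2]) <= tol:
            return t_next, seq
    raise RuntimeError("recursion did not converge")

def m_example(t):
    """Hypothetical increasing, continuous map (an assumption, not the paper's m)."""
    return 0.5 + 0.5 * math.tanh(t)

# Start below the fixed point so that the sequence increases toward it.
t_star, path = iterate_to_fixed_point(m_example, t0=0.2)
```

Because `m_example` is increasing and `m_example(0.2) >= 0.2`, the sequence in `path` is nondecreasing and its limit satisfies $t = m(t)$, mirroring the argument that the iterates increase to a fixed point.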

For each $b \in (\epsilon, 1 - \epsilon)$, let $C_b$ be the class of $\chi$-functions satisfying (A1) and $b(\chi) = b$. Also let $C$ be the class of functions satisfying (A1) and (A2), that is, $C = \bigcup_{\epsilon < b < 1 - \epsilon} C_b$. The following lemma is needed to prove Theorem 3.

Lemma 4: Fix $b \in (\epsilon, 1 - \epsilon)$ and let $a = F_0^{-1}[1 - (b/2)]$. Under the assumptions of Theorem 3 we have: (a) $S^-(\chi_a) \ge S^-(\chi)$ for all $\chi \in C_b$; and (b) $g_\chi(s^*,t_0) \ge g_{\chi_a}(s^*,t_0)$ for all $\chi \in C_b$ □

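Proof. Part (a) follows directly from (16) and Lemma A3 in Martin and Zamar (1989). To prove (b), notice that for all $\chi \in C_b$ we have $\int_{-a}^{a} \chi(x) f_0(x)\,dx = 2\int_a^\infty [1 - \chi(x)] f_0(x)\,dx$. Thus,

\[ s^* \int_{-a}^{a} \chi(x)[f_0(s^*x - t_0) + f_0(s^*x + t_0)]\,dx = s^* \int_{-a}^{a} \chi(x) f_0(x) k_0(x)\,dx \ge s^* k_0(a) \int_{-a}^{a} \chi(x) f_0(x)\,dx \]
\[ = 2 s^* k_0(a) \int_a^\infty [1 - \chi(x)] f_0(x)\,dx = s^* k_0(a) \left[ \int_{-\infty}^{-a} [1 - \chi(x)] f_0(x)\,dx + \int_a^\infty [1 - \chi(x)] f_0(x)\,dx \right] \]
\[ \ge s^* \left[ \int_{-\infty}^{-a} [1 - \chi(x)] k_0(x) f_0(x)\,dx + \int_a^\infty [1 - \chi(x)] k_0(x) f_0(x)\,dx \right] , \]

and (b) follows □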

Proof of Theorem 3. First of all notice that since $S^+(a)$ and $S^-(a)$ are increasing at $a^*$ and $L_1$ and $L_2$ are strictly monotone, we have $L_2[S^+(a^*)] = L_1[S^-(a^*)] = B(a^*)$.

Let $\chi \in C$ be fixed and set $b = b(\chi)$. Let $a = F_0^{-1}[1 - (b/2)]$ so that $b(\chi_a) = b$. If $g_\chi(s^*,t_0) \ge (b - \epsilon)/(1 - \epsilon)$ then $S^+(\chi) \ge s^*$. So $B(\chi) \ge L_2[S^+(\chi)] \ge L_2(s^*) = B(a^*)$.

On the other hand, suppose that $g_\chi(s^*,t_0) < (b - \epsilon)/(1 - \epsilon)$, that is, $S^+(\chi) < s^*$. Since $\chi \in C_b$, by Lemma 4 (b) we have $g_{\chi_a}(s^*,t_0) \le g_\chi(s^*,t_0) < (b - \epsilon)/(1 - \epsilon)$. Hence $S^+(a) < s^*$, too. In view of the optimality of $\chi_{a^*}$ among jump functions we have $B(a) \ge B(a^*)$ and so $L_1[S^-(a)] \ge L_1[S^-(a^*)]$. For the particular $b$ in question, by Lemma 4 (a), $S^-(\chi_a) \ge S^-(\chi)$. Therefore, $B(\chi) \ge L_1[S^-(\chi)] \ge L_1[S^-(a)] \ge L_1[S^-(a^*)] = B(a^*)$, and the theorem follows □

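Proof of Theorem 4. Let $F_\infty = (1 - \epsilon)F_0 + \epsilon\delta_\infty$, $t_\infty = T(F_\infty)$ and $s_\infty = S_\chi(F_\infty)$. First notice that

\[ \gamma^{-1}[(b - \epsilon)/(1 - \epsilon)] = \sup_{F \in \mathcal{F}_\epsilon} S_\chi(F,0) = S_\chi(F_\infty,0) , \tag{28} \]

where $S_\chi(F,0)$ is the S-scale functional based on $\chi$ and the true location 0. By definition of the S-estimate of scale, for all $F \in \mathcal{F}_\epsilon$, $S_\chi(F) = \inf_t S_\chi(F,t) \le \inf_t S_\chi(F_\infty,t) = S_\chi(F_\infty,0)$, and so $S^+(\chi) \le S_\chi(F_\infty,0)$. Assume first that $s_\infty < \infty$ and so $t_\infty < \infty$. By monotonicity of $g_\chi(s,t)$, $b = E_{F_\infty}\chi[(X - t_\infty)/s_\infty] = (1 - \epsilon)g_\chi(s_\infty,t_\infty) + \epsilon \ge (1 - \epsilon)g_\chi(s_\infty,0) + \epsilon = E_{F_\infty}\chi(X/s_\infty)$. Therefore,

\[ S_\chi(F_\infty,0) \le s_\infty \le S^+(\chi) . \tag{29} \]

Observe that, if $s_\infty = \infty$, then (29) trivially holds. Now, (28) and (29) imply that $S^+(\chi) \le \gamma^{-1}[(b - \epsilon)/(1 - \epsilon)] \le S^+(\chi,T)$ and (a) follows by taking $T(F) = \arg\min_t S_\chi(F,t)$.

To prove (b) write $\gamma^{-1}[b/(1 - \epsilon)] = S_\chi(F_0^*,0)$, where $F_0^* = (1 - \epsilon)F_0 + \epsilon\delta_0$. For all $F \in \mathcal{F}_\epsilon$, $t \in R$ and $s > 0$, $E_F\chi[(X - t)/s] \ge (1 - \epsilon)E_{F_0}\chi[(X - t)/s] \ge (1 - \epsilon)E_{F_0}\chi(X/s) = E_{F_0^*}\chi(X/s)$. Therefore, for all M-estimates of scale based on the given $\chi$ and for all $T$ satisfying the assumptions of this theorem we have

\[ S^-(\chi,T) = \gamma^{-1}[b/(1 - \epsilon)] \quad □ \]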
Proof of Theorem 5. Follows directly from Theorem 4 and Theorem 2 in Martin and Zamar (1989) □


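Proof of Theorem 6. Let $0 < a < \infty$. It suffices to show that, for each $\Delta > 0$,

\[ \lim_{m \to \infty} \sup_{F \in \mathcal{F}_\epsilon} P_F\left\{ \sup_{n \ge m} S_{\chi_a}(F_n) > S^+(a) + \Delta \right\} = 0 , \tag{30} \]

and

\[ \lim_{m \to \infty} \sup_{F \in \mathcal{F}_\epsilon} P_F\left\{ \inf_{n \ge m} S_{\chi_a}(F_n) < S^-(a) - \Delta \right\} = 0 . \tag{31} \]

For each $\delta > 0$ the approximating (even) function $\rho_\delta(x)$ is defined as

\[ \rho_\delta(x) = \begin{cases} 0 & \text{if } 0 \le x \le a - \delta \\ 1 - (a - x)/\delta & \text{if } a - \delta < x < a \\ 1 & \text{if } x \ge a . \end{cases} \tag{32} \]

Notice that $\rho_\delta(x)$ is continuous and that $\rho_\delta(x) \ge \chi_a(x)$ for all $x$. For each $t \in R$ and all $F$, let

\[ S_\delta(F,t) = \sup\{ s : E_F\rho_\delta[(x - t)/s] \ge b(\chi_a) \} . \tag{33} \]

Clearly, for all $t$ and all $F$ (including the empirical c.d.f. $F_n$) we have $S_{\chi_a}(F,t) \le S_\delta(F,t)$ and so

\[ S_{\chi_a}(F) \le S_\delta(F) = \inf_t S_\delta(F,t) . \tag{34} \]

It is not difficult to verify that, for all given $\Delta > 0$ there exists $\delta_0 > 0$ such that

\[ S^+_{\delta_0} = \sup_{F \in \mathcal{F}_\epsilon} S_{\delta_0}(F) \le S^+(a) + (\Delta/2) . \tag{35} \]

By Theorem 1,

\[ \lim_{m \to \infty} \sup_{F \in \mathcal{F}_\epsilon} P_F\left\{ \sup_{n \ge m} S_{\delta_0}(F_n) > S^+_{\delta_0} + (\Delta/2) \right\} = 0 . \tag{36} \]

Now (30) follows from (33), (34), (35) and (36). Finally, (31) can be proved in a similar way, using the approximating function

\[ \rho_\delta(x) = \begin{cases} 0 & \text{if } 0 \le x \le a \\ (x - a)/\delta & \text{if } a < x < a + \delta \\ 1 & \text{if } x \ge a + \delta , \end{cases} \tag{37} \]

and the approximating scale functional

\[ S_\delta(F,t) = \inf\{ s : E_F\rho_\delta[(x - t)/s] \le b(\chi_a) \} \quad □ \tag{38} \]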
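The two continuous approximations in (32) and (37) bracket the jump function from above and from below. The Python sketch below checks this on a grid; it assumes that $\chi_a(x) = 1$ for $|x| \ge a$ and $0$ otherwise and that both approximations are piecewise linear on the transition interval. The function names and the values of $a$ and $\delta$ are hypothetical.

```python
def chi_a(x, a):
    """Jump score function (assumed form): 0 inside (-a, a), 1 outside."""
    return 1.0 if abs(x) >= a else 0.0

def rho_upper(x, a, delta):
    """Continuous majorant in the spirit of (32): rises linearly from 0 at a - delta to 1 at a."""
    u = abs(x)
    if u <= a - delta:
        return 0.0
    if u >= a:
        return 1.0
    return 1.0 - (a - u) / delta

def rho_lower(x, a, delta):
    """Continuous minorant in the spirit of (37): rises linearly from 0 at a to 1 at a + delta."""
    u = abs(x)
    if u <= a:
        return 0.0
    if u >= a + delta:
        return 1.0
    return (u - a) / delta

# Check the sandwich rho_lower <= chi_a <= rho_upper on a grid of points.
a_val, delta_val = 1.0, 0.25
grid = [i / 100.0 for i in range(-300, 301)]
sandwich_holds = all(
    rho_lower(x, a_val, delta_val) <= chi_a(x, a_val) <= rho_upper(x, a_val, delta_val)
    for x in grid
)
```

Since a larger score function makes the constraint in (33) easier to satisfy, the majorant yields a scale functional that is at least as large as the one based on $\chi_a$, which is the inequality used in (34).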
FIGURE 1. Maximum bias curves.

FIGURE 3. Mean-squared-error curves for n = 20, F = N(0,1) and logarithmic loss function.

FIGURE 4. Bias curves for n = 20, F = N(0,1) and logarithmic loss function.

FIGURE 5. Finite sample (n = 20) and asymptotic maximum bias curves for MAD and SHORTH for F = N(0,1) and logarithmic loss.

FIGURE 6. Finite sample (n = 40) and asymptotic maximum bias curves for MAD and SHORTH for F = N(0,1) and logarithmic loss.